
Improve simplified Chinese text search result #37

Merged
1 commit merged on May 4, 2022

Conversation

aidenlx
Contributor

@aidenlx aidenlx commented Apr 30, 2022

The new tokenize function detects whether another Chinese word segmenter plugin is enabled and, if so, uses it to extend the tokenize result whenever a token's text contains Chinese characters.

This PR should solve #33.

cm-chs-patch has an option to use a WASM segmenter module instead of the system segmenter, which produces poor index results for some keywords.

@scambier
Owner

Hello, and thank you very much for this PR! I'll merge it ASAP.

@scambier scambier merged commit 7fcac79 into scambier:master May 4, 2022
@scambier scambier added this to the 1.3 milestone May 4, 2022